A non-parametric k-nearest neighbour entropy estimator
Authors
Abstract
A non-parametric k-nearest neighbour based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by accounting for non-uniform probability densities in the region of the k nearest neighbours around each sample point. It targets three situations in which classical estimators struggle: first, when the dimensionality of the random variable is large; second, when near-functional relationships lead to high correlation between components of the random variable; and third, when the marginal variances of the components differ significantly from each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested on a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with that of a classical estimator and shown to be significantly better.
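For reference, the classical Kozachenko-Leonenko baseline mentioned in the abstract estimates differential entropy from the distance of each sample to its k-th nearest neighbour, as psi(n) - psi(k) + log(V_d) + (d/n) * sum_i log(r_i), where r_i is that distance and V_d the volume of the d-dimensional unit ball. The sketch below implements only this classical baseline; the function name, the SciPy k-d tree, and the default k = 3 are illustrative choices, and the paper's proposed non-uniform-density correction is not shown.

```python
import numpy as np
from scipy.special import digamma, gammaln
from scipy.spatial import cKDTree

def kl_entropy(x, k=3):
    """Classical Kozachenko-Leonenko k-NN entropy estimate (in nats).

    x : (n, d) array of samples; k : number of neighbours.
    A minimal sketch of the classical baseline, not the paper's corrected estimator.
    """
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    # distance from each point to its k-th nearest neighbour (self excluded)
    r, _ = tree.query(x, k=k + 1)
    r_k = r[:, -1]
    # log-volume of the d-dimensional Euclidean unit ball
    log_vd = 0.5 * d * np.log(np.pi) - gammaln(0.5 * d + 1.0)
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(r_k))
```

As a quick sanity check, the estimate for samples from a d-dimensional standard normal should approach 0.5 * d * log(2*pi*e), roughly 1.42 nats per dimension, as the sample size grows.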
Similar articles
Asymptotic Behaviors of Nearest Neighbor Kernel Density Estimator in Left-truncated Data
Kernel density estimators are the basic tools for density estimation in non-parametric statistics. The k-nearest neighbor kernel estimators represent a special form of kernel density estimators, in which the bandwidth is varied depending on the location of the sample points. In this paper, we initially introduce the k-nearest neighbor kernel density estimator in the random left-truncatio...
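The adaptive-bandwidth idea described in this snippet (the bandwidth at each evaluation point is the distance to its k-th nearest sample) can be sketched as below. This is the generic, fully observed k-NN kernel density estimator with a Gaussian kernel, not the left-truncated variant the paper studies; function name, kernel choice and default k are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kernel_density(x_query, x_sample, k=10):
    """k-nearest-neighbour kernel density estimate with a Gaussian kernel.

    The local bandwidth at each query point is the distance to its k-th
    nearest sample, so the estimator adapts to the local density.
    Requires k >= 2.
    """
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    x_sample = np.atleast_2d(np.asarray(x_sample, dtype=float))
    n, d = x_sample.shape
    tree = cKDTree(x_sample)
    h, _ = tree.query(x_query, k=k)                      # (m, k) neighbour distances
    h_k = h[:, -1]                                       # local bandwidth per query point
    diff = x_query[:, None, :] - x_sample[None, :, :]    # (m, n, d) pairwise differences
    u2 = np.sum(diff ** 2, axis=-1) / h_k[:, None] ** 2
    kern = np.exp(-0.5 * u2) / (2 * np.pi) ** (d / 2)    # multivariate Gaussian kernel
    return kern.sum(axis=1) / (n * h_k ** d)
```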
Efficient multivariate entropy estimation via k-nearest neighbour distances
Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient in the sense of achieving the local asymptotic minimax lower bound. To this end, we initially study a generalisation of the estimator originally proposed by Ko...
A Nearest-Neighbour Approach to Estimation of Entropies
The concept of Shannon entropy as a measure of disorder is introduced and the generalisations of the Rényi and Tsallis entropy are motivated and defined. A number of different estimators for Shannon, Rényi and Tsallis entropy are defined in the theoretical part and compared by simulation in the practical part. In this work the nearest neighbour estimator presented in Leonenko and Pronzato (2010...
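For orientation, the three entropies named in this snippet have the standard differential definitions below; q denotes the order parameter and f the density (notation chosen here for illustration).

```latex
% Shannon, Rényi and Tsallis differential entropies of a density f
H(f)   = -\int f(x)\,\log f(x)\,\mathrm{d}x
H_q(f) = \frac{1}{1-q}\,\log\!\int f(x)^{q}\,\mathrm{d}x ,
         \qquad q>0,\ q\neq 1 \quad \text{(R\'enyi)}
S_q(f) = \frac{1}{q-1}\left(1-\int f(x)^{q}\,\mathrm{d}x\right),
         \qquad q>0,\ q\neq 1 \quad \text{(Tsallis)}
```

Both the Rényi and the Tsallis entropy recover the Shannon entropy in the limit q → 1.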
Estimating Individual Tree Growth with the k-Nearest Neighbour and k-Most Similar Neighbour Methods
The purpose of this study was to examine the use of non-parametric methods in estimating tree level growth models. In non-parametric methods the growth of a tree is predicted as a weighted average of the values of neighbouring observations. The selection of the nearest neighbours is based on the differences between tree and stand level characteristics of the target tree and the neighbours. The ...
Generalized K-Nearest Neighbour Algorithm - A Predicting Tool
The k-nearest neighbour algorithm is a non-parametric machine learning algorithm generally used for classification; it is also known as instance-based or lazy learning. The k-NN algorithm can also be adapted for regression, that is, for estimating continuous variables. In this research paper the researcher presents a generalised k-nearest neighbour algorithm for predicting a continuous value. In or...
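A minimal sketch of k-NN regression as described in this snippet: the continuous target is predicted as a weighted average of the targets of the k nearest training points. The inverse-distance weighting, function name and default k are assumptions for illustration and need not match the paper's generalised algorithm.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_regress(x_query, x_train, y_train, k=5):
    """Predict a continuous target as the inverse-distance-weighted average
    of the targets of the k nearest training points. Requires k >= 2.
    """
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    x_train = np.atleast_2d(np.asarray(x_train, dtype=float))
    y_train = np.asarray(y_train, dtype=float)
    tree = cKDTree(x_train)
    dist, idx = tree.query(x_query, k=k)     # (m, k) distances and indices
    w = 1.0 / (dist + 1e-12)                 # inverse-distance weights (epsilon avoids div by zero)
    return np.sum(w * y_train[idx], axis=1) / np.sum(w, axis=1)
```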
Journal: CoRR
Volume: abs/1506.06501
Issue:
Pages: -
Publication date: 2015